22 research outputs found

    Bridging the Gap Between the Least and the Most Influential Twitter Users

    Social networks play an increasingly important role in shaping the behaviour of Web users. Twitter arguably stands out from the others, not only for the platform's simplicity but also for the great influence that messages sent over the network can have. The impact of such messages determines the influence of a Twitter user, and it is what tools such as Klout, PeerIndex or TwitterGrader aim to calculate. Reducing all the factors that make a person influential to a single number is not an easy task, and the effort involved is wasted if Twitter users do not know how to improve that number. In this paper we identify the specific actions a Twitterer should carry out to increase their influence in each of the above-mentioned tools. For this purpose we apply data mining techniques based on classification and regression algorithms to information collected from a set of Twitter users. This work has been partially funded by the European Commission project "SiSOB: An Observatorium for Science in Society based in Social Models" (http://sisob.lcc.uma.es) (Contract no.: FP7 266588), "Sistemas Inalámbricos de Gestión de Información Crítica" (code number TIN2011-23795, granted by the MEC, Spain) and "3DTUTOR: Sistema Interoperable de Asistencia y Tutoría Virtual e Inteligente 3D" (code number IPT-2011-0889-900000, granted by the MINECO, Spain).

    New Approaches in Incremental Learning (Nuevos enfoques en aprendizaje incremental)

    The volume of data currently generated in many domains is very large, to the point that it can be difficult even to store. Performing machine learning tasks on such quantities of information is making new algorithms necessary. This thesis presents several contributions to incremental learning, aimed fundamentally at improving it through algorithms based on concentration bounds and multi-classifier systems.

    GNUsmail: Open framework for on-line email classification

    Real-time classification of massive email data is a challenging task with its own particular difficulties. Because email data has a strong temporal component, several problems arise: emails arrive continuously, and the criteria used to classify them can change, so the learning algorithms must be able to deal with concept drift. Our problem is more general than spam detection, which has received much more attention in the literature. In this paper we present GNUsmail, an open-source extensible framework for email classification whose structure supports incremental and on-line learning. The framework enables the incorporation of algorithms developed by other researchers, such as those included in WEKA and MOA. We evaluate the framework, which is characterized by two overlapping phases (pre-processing and learning), using the ENRON dataset, and we compare the results achieved by WEKA and MOA algorithms.

    Mining Web-based Educational Systems to Predict Student Learning Achievements

    Educational Data Mining (EDM) is gaining importance as a new interdisciplinary research field related to several other areas. It is directly connected with Web-based Educational Systems (WBES) and Data Mining (DM, a fundamental part of Knowledge Discovery in Databases). The former defines the context: WBES store and manage huge amounts of data. These data keep growing, and they contain hidden knowledge that could be very useful to users (both teachers and students). It is desirable to identify such knowledge in the form of models, patterns or any other representation schema that allows better exploitation of the system. The latter is the tool for achieving that discovery. Data mining must cope with very complex and varied situations to reach quality solutions, so it is a research field where many advances are being made to accommodate and solve emerging problems, and many techniques are usually considered. In this paper we study how data mining can be used to induce student models from the data acquired by a specific Web-based tool for adaptive testing, called SIETTE. Specifically, we have used top-down induction of decision trees algorithms to extract the patterns, because decision tree models are easily understandable. In addition, the validation processes conducted indicate that the resulting models are of high quality.
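
As an illustration of the top-down induction step mentioned in this abstract, the sketch below picks the attribute to split on by information gain, the criterion used by ID3-style learners. The abstract does not say which splitting criterion or which attributes the authors used, so the criterion, the attribute names (`quiz_score`, `attempts`) and the toy rows are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(p) / len(labels) * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Attribute with the highest information gain (the root of the tree)."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data, not taken from SIETTE.
rows = [
    {"quiz_score": "high", "attempts": "one"},
    {"quiz_score": "high", "attempts": "many"},
    {"quiz_score": "low", "attempts": "one"},
    {"quiz_score": "low", "attempts": "many"},
]
labels = ["pass", "pass", "fail", "fail"]

root = best_attribute(rows, labels, ["quiz_score", "attempts"])
print(root)
```

A full learner would apply `best_attribute` recursively to each partition until the labels in a node are pure or no attributes remain.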

    StreetQR Project. Device for Information Assistance in Streets and Places of Interest

    This work presents an example of knowledge transfer from the university to society, within the field of Artificial Intelligence, with a view to building a productive university-industry link. It describes the StreetQR project, whose goal is to deploy the device of that name on the campus of the Universidad de Málaga, and which is currently under development. StreetQR is an information-assistance device for street signs and places of interest that serves three functions: giving citizens in a city situational information, capturing information about the city's vehicle and pedestrian flow, and alerting the population in special situations. The paper explains the device and how it works, as well as the institutional framework that the Universidad de Málaga has provided for moving from a patent to a project whose goal is a working prototype of the device on the university campus. The current state of the project's development is also presented. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tec

    Educational Data Mining for Personalized Prediction of Academic Performance

    Educational Data Mining (EDM) is gaining importance as a new interdisciplinary research field related to several other areas. It is directly connected with Web-based Educational Systems (WBES) and Data Mining (DM), the latter being a fundamental part of Knowledge Discovery in Databases (KDD). WBES store and manage large amounts of data. These data keep growing and contain hidden knowledge that could be very useful to users (both teachers and students). It is desirable to identify such knowledge in the form of models, patterns or any other representation schema that allows better exploitation of the system. Data mining emerges as the tool for achieving this discovery, giving rise to EDM. In this complex context, different learning techniques and algorithms are typically used to obtain the best results. This work studies, for a Theoretical Computer Science course, specifically "Teoría de Autómatas y Lenguajes Formales" (Automata Theory and Formal Languages), how to predict the academic performance achieved by students from the results of intermediate tests. To that end, different types of learning algorithms (nearest neighbours, decision trees, multi-classifiers) were applied and compared. The entire process of testing and evaluating students during the course was carried out through the web tool SIETTE, developed in our department and also used outside our own university. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. This work was partially funded by the I Plan Propio de Investigación y Transferencia of the Universidad de Málaga.

    Detection of unfavourable urban areas with higher temperatures and lack of green spaces using satellite imagery in sixteen Spanish cities.

    This paper seeks to identify the most unfavourable areas of a city in terms of high temperatures and the absence of green infrastructure. An automatic methodology based on remote sensing and data analysis has been developed and applied in sixteen Spanish cities with different characteristics. Landsat-8 satellite images were selected for each city from the July-August period of 2019 and 2020 to calculate the spatial variation of land surface temperature (LST). The Normalized Difference Vegetation Index (NDVI) was used to determine the abundance of vegetation across the city. Based on the NDVI and LST maps created, an unsupervised k-means clustering was performed to automatically group areas according to how favourable they were in terms of temperature and presence of vegetation. A Disadvantaged Area Index (DAI), combining both variables, was developed to produce a map showing the most unfavourable areas for each city. Overall, the percentage of the area that could be improved with more vegetation in the cities studied ranged from 13 % in Huesca to 64–65 % in Bilbao and Valencia. The influence of several factors, such as the presence of water bodies or large buildings, is discussed. Detecting unfavourable areas is a useful tool for defining future planning strategy for green spaces.
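
The pipeline in this abstract (NDVI per pixel, then k-means over NDVI and LST) can be sketched in a few lines. NDVI follows its standard definition, (NIR - Red) / (NIR + Red). Everything else here is an assumption for illustration: the pixel values are invented, LST is min-max scaled so both features share a range, two clusters are used, and the k-means is a plain Lloyd iteration seeded with the first `k` points. The abstract does not give the DAI formula, so it is not reproduced.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); returns 0.0 when both bands are zero."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means on tuples, seeded with the first k points."""
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        assignment = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, centroids


# Hypothetical pixels: (NIR reflectance, red reflectance, LST in degrees Celsius).
pixels = [(0.50, 0.10, 30.0), (0.45, 0.08, 31.0), (0.20, 0.18, 42.0), (0.22, 0.20, 41.0)]

lst_values = [p[2] for p in pixels]
lst_min, lst_max = min(lst_values), max(lst_values)

# Feature vector per pixel: (NDVI, LST scaled to [0, 1]).
features = [(ndvi(n, r), (t - lst_min) / (lst_max - lst_min)) for n, r, t in pixels]

assignment, centroids = kmeans(features, k=2)
print(assignment)
```

On real imagery the same steps would run over raster arrays, and the number of clusters would be chosen from the data instead of being fixed at two.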

    Online and Non-Parametric Drift Detection Methods Based on Hoeffding’s Bounds

    I. Frías-Blanco, J. d. Campo-Ávila, G. Ramos-Jiménez, R. Morales-Bueno, A. Ortiz-Díaz and Y. Caballero-Mota, "Online and Non-Parametric Drift Detection Methods Based on Hoeffding’s Bounds," in IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 3, pp. 810-823, 1 March 2015, doi: 10.1109/TKDE.2014.2345382. © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    Incremental and online learning algorithms are becoming more relevant in the data mining context because of the increasing need to process data streams. In this context, the target function may change over time, an inherent problem of online learning known as concept drift. To handle concept drift regardless of the learning model, we propose new methods that monitor the performance metrics measured during the learning process and trigger drift signals when a significant variation is detected. To monitor this performance, we apply probability inequalities that assume only independent, univariate and bounded random variables, which yields theoretical guarantees for the detection of such distributional changes. Some common restrictions for online change detection, as well as relevant types of change (abrupt and gradual), are considered. Two main approaches are proposed: the first involves moving averages and is more suitable for detecting abrupt changes; the second follows a widespread intuitive idea for dealing with gradual changes, using weighted moving averages. The simplicity of the proposed methods, together with their computational efficiency, makes them very advantageous. We use a Naïve Bayes classifier and a Perceptron to evaluate the performance of the methods over synthetic and real data. Supported in part by the SESAAME project number TIN2008-06582-C03-03 of the MICINN, Spain. Supported in part by the AUIP (Asociación Universitaria Iberoamericana de Postgrado).
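
To make the idea of a Hoeffding-based drift signal concrete, here is a deliberately simplified sketch. It is not the method proposed in the paper. It compares the mean error of a fixed reference window with that of a sliding recent window and signals drift when the increase exceeds the one-sided Hoeffding bound for the difference of two means of independent values in [0, 1], epsilon = sqrt((1/n + 1/m) * ln(1/delta) / 2). The class name, window size, `delta` and the synthetic error stream are all assumptions chosen for this example.

```python
from math import log, sqrt


def hoeffding_epsilon(n, m, delta):
    """Bound on the difference of two means of n and m independent values in [0, 1]."""
    return sqrt((1.0 / n + 1.0 / m) * log(1.0 / delta) / 2.0)


class MeanShiftDetector:
    """Illustrative detector: compares a reference window to a recent window."""

    def __init__(self, window=50, delta=0.001):
        self.window = window
        self.delta = delta
        self.reference = []
        self.recent = []

    def update(self, error):
        """Feed one 0/1 prediction error; return True when drift is signalled."""
        if len(self.reference) < self.window:
            self.reference.append(error)
            return False
        self.recent.append(error)
        if len(self.recent) > self.window:
            self.recent.pop(0)
        if len(self.recent) < self.window:
            return False
        ref_mean = sum(self.reference) / len(self.reference)
        rec_mean = sum(self.recent) / len(self.recent)
        eps = hoeffding_epsilon(len(self.reference), len(self.recent), self.delta)
        if rec_mean - ref_mean > eps:
            # Restart from the post-change data after signalling.
            self.reference, self.recent = self.recent, []
            return True
        return False


# Synthetic error stream: 10 % error rate, then an abrupt jump to 60 % at index 200.
stream = [1 if i % 10 == 0 else 0 for i in range(200)]
stream += [1 if i % 10 < 6 else 0 for i in range(200)]

detector = MeanShiftDetector()
detections = [i for i, e in enumerate(stream) if detector.update(e)]
print(detections)
```

The fixed windows keep the sketch short. The abstract's moving-average and weighted-moving-average approaches address abrupt and gradual change respectively, which this example does not attempt to distinguish.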

    The evolution of the ventilatory ratio is a prognostic factor in mechanically ventilated COVID-19 ARDS patients

    Background: Mortality due to COVID-19 is high, especially in patients requiring mechanical ventilation. The purpose of the study is to investigate associations between mortality and variables measured during the first three days of mechanical ventilation in patients with COVID-19 intubated at ICU admission. Methods: This multicenter, observational cohort study included consecutive patients with COVID-19 admitted to 44 Spanish ICUs between February 25 and July 31, 2020, who required intubation at ICU admission and mechanical ventilation for more than three days. We collected demographic and clinical data prior to admission; information about clinical evolution at days 1 and 3 of mechanical ventilation; and outcomes. Results: Of the 2,095 patients with COVID-19 admitted to the ICU, 1,118 (53.3%) were intubated at day 1 and remained under mechanical ventilation at day 3. From day 1 to day 3, PaO2/FiO2 increased from 115.6 [80.0-171.2] to 180.0 [135.4-227.9] mmHg and the ventilatory ratio from 1.73 [1.33-2.25] to 1.96 [1.61-2.40]. In-hospital mortality was 38.7%. A higher increase between ICU admission and day 3 in the ventilatory ratio (OR 1.04 [CI 1.01-1.07], p = 0.030) and creatinine levels (OR 1.05 [CI 1.01-1.09], p = 0.005) and a lower increase in platelet counts (OR 0.96 [CI 0.93-1.00], p = 0.037) were independently associated with a higher risk of death. No association between mortality and the PaO2/FiO2 variation was observed (OR 0.99 [CI 0.95-1.02], p = 0.47). Conclusions: A higher ventilatory ratio and its increase at day 3 are associated with mortality in patients with COVID-19 receiving mechanical ventilation at ICU admission. No association was found with the PaO2/FiO2 variation.

    Reducing the environmental impact of surgery on a global scale: systematic review and co-prioritization with healthcare workers in 132 countries

    Background: Healthcare cannot achieve net-zero carbon without addressing operating theatres. The aim of this study was to prioritize feasible interventions to reduce the environmental impact of operating theatres. Methods: This study adopted a four-phase Delphi consensus co-prioritization methodology. In phase 1, a systematic review of published interventions and global consultation of perioperative healthcare professionals were used to longlist interventions. In phase 2, iterative thematic analysis consolidated comparable interventions into a shortlist. In phase 3, the shortlist was co-prioritized based on patient and clinician views on acceptability, feasibility, and safety. In phase 4, ranked lists of interventions were presented by their relevance to high-income countries and low–middle-income countries. Results: In phase 1, 43 interventions were identified, which had low uptake in practice according to 3042 professionals globally. In phase 2, a shortlist of 15 intervention domains was generated. In phase 3, interventions were deemed acceptable for more than 90 per cent of patients except for reducing general anaesthesia (84 per cent) and re-sterilization of ‘single-use’ consumables (86 per cent). In phase 4, the top three shortlisted interventions for high-income countries were: introducing recycling; reducing use of anaesthetic gases; and appropriate clinical waste processing. In phase 4, the top three shortlisted interventions for low–middle-income countries were: introducing reusable surgical devices; reducing use of consumables; and reducing the use of general anaesthesia. Conclusion: This is a step toward environmentally sustainable operating environments with actionable interventions applicable to both high- and low–middle-income countries.